Abstract

We consider a general system of n noninteracting identical particles which evolve under a 
given dynamical law and whose initial microstates are a priori independent. The time evolution 
of the n-particle average of a bounded function on the particle microstates is then examined in 
the large n limit. Using the theory of large deviations, we show that if the initial macroscopic 
average is constrained to be near a given value, y, then the macroscopic average at time t 
converges in probability as $n \to \infty$ to a value, $\psi_t(y)$, given explicitly in terms of a canonical
expectation. Some general features of the graph of $\psi_t(y)$ versus $t$ are examined, particularly in
regard to continuity, symmetry, and convergence. 

Key words: determinism, causality, large deviation theory, many-particle systems, fluctuations, 
nonequilibrium statistical mechanics, kinetic theory 



1 Introduction 

The emergence of determinism in the macroscopic variables of a system from its underlying mi- 
croscopic dynamics has been a subject of great importance in the field of statistical mechanics. 
If one supposes the microscopic variables are rigidly deterministic, then the time evolution of the 
macroscopic variables as a function of the initial microstate is of course rigidly deterministic as well. 
Considered as a function of the initial macrostate, however, its time evolution fails to be determin- 
istic due to the multiplicity of microstates consistent with the given macrostate. This leads to the 
familiar ensemble description, as described by a conditional a priori measure. Given the highly 
irregular behavior of many macroscopic systems on the microscopic level, however, it may appear 
that all deterministic behavior is utterly lost on the macroscopic level. 

By contrast, many macroscopic systems are described quite well by deterministic models, even 
if they are not deterministic in the strict sense. Well known examples include thermal conduction, 
diffusion, hydrodynamics, and chemical reactions. Characteristic of many such systems is the presence
of extensive macroscopic variables which are sums or averages over a great many microscopic
quantities. The task of deriving differential equations for such variables has been taken up by several 
researchers ||, and here Markov processes have played an important role. In this approach, one 
derives a coupled set of differential equations for the moments of the macroscopic variable, where 
the deterministic behavior is given by the first moment when the dispersion goes to zero. Since 
few physical observables are truly Markovian, delicate scaling between vanishing interactions and 
dilating time scales must be used to obtain an asymptotically Markovian process jj, ||. However, 
the consistency of such assumptions with the underlying microscopic dynamics and the relevance of 
strong interactions have been repeatedly called into question |6[ 0|. In particular, the relaxation of 
macroscopic systems to a state of equilibrium may seem inconsistent with the underlying reversible 
microscopic dynamics. 

We take a somewhat different and more general view of this problem. Instead of attempting
to derive differential equations of motion for a class of macroscopic variables, we consider instead 
the existence and character of an emergent form of determinism for large systems, which we call 
macroscopic determinism. By this we mean that if the macrostate is initially constrained to be 
near a given value, $y$, then there exists a map $\psi_t$ such that the probability that the macrostate is
near $\psi_t(y)$ at time $t$ approaches one as the number of particles approaches infinity. For simplicity,
we restrict attention to systems of dynamically noninteracting particles and macroscopic variables 
which take the form of averages over these microscopic quantities. The mathematical theory of large 
deviations, an extension of the law of large numbers, provides a useful tool for addressing such
questions and is used in Section 4 to obtain the main result. Our primary contribution has been to
apply well-known equilibrium results to systems initially out of equilibrium. 

The collection of macrostates $\{\psi_t(y) : t \in T\}$ for a given $y$ and set of times $T$ constitutes a
collection of highly probable states, which we call the deterministic curve, akin to the concentration
curve of P. and T. Ehrenfest in their discussion of Boltzmann's H-theorem [?]. In Section 5 we
show that the macroscopic average converges to this curve in probability for any given finite, and
in some cases infinite, set of times, though large deviations from this curve will persist whenever
the number of particles is finite. For certain reversible microscopic dynamics which preserve the a
priori measure, the deterministic curve of a restricted class of macroscopic variables is symmetric
in time about the initial specification of the macroscopic variable. Furthermore, if the microscopic
dynamical law is mixing, then the deterministic curve converges as $t \to \pm\infty$ to a value independent
of $y$, a situation corresponding to equilibration of the macrostate.



2 Problem Description 

We begin with a mathematical description of the relevant physical quantities. The microstate space 
of a single particle is denoted by X, while the microstate space of the total n-particle system is 
given by the Cartesian product $X^n = X \times \cdots \times X$. The projection map $\pi_i$ is defined such that,
if $(x_1, \ldots, x_n) \in X^n$ is the microstate of the system, then $\pi_i(x_1, \ldots, x_n) = x_i$ is the microstate of
the $i$th particle. To apply the techniques of large deviation theory we shall need to make the mild
assumption that $(X, d)$ is a Polish space, i.e. a complete separable metric space, with respect to some
given metric $d$, and denote by $\mathcal{A}$ the set of Borel subsets of $X$ generated by the metric topology.

Let $\mathcal{P}(X)$ denote the set of Borel probability measures on $X$ and note that $\mathcal{P}(X)$ is itself a
Polish space under the Prohorov metric $\rho$ induced by the metric $d$ on $X$ [?, p. 317]. Denote by
$\mu \in \mathcal{P}(X)$ the a priori probability measure for the initial microstate of a given particle. In other
words, for $C \in \mathcal{A}$, $\mu[C]$ is the probability that a given particle's initial microstate is in $C \subseteq X$, before
any conditioning on the macrostate has taken place. All particle microstates therefore have equal a
priori probability in the sense that they are uniformly distributed with respect to $\mu$. Usually, $\mu$ is
taken to be an invariant measure. The system microstates are assumed to be a priori independent
and identically distributed (i.i.d.) with marginal $\mu$. Thus, the a priori distribution on $X^n$ is given
by the product measure $\mu^n = \mu \times \cdots \times \mu$.

Let $g : X \to Y$ be a measurable function which is bounded and continuous $\mu$-almost everywhere
(a.e.). For a given particle microstate $x_i \in X$, $g(x_i)$ gives the corresponding single-particle
macrostate. For a collection of $n$ particles, the macroscopic average $G : X^n \to Y$ of $g$ is given by

$$ G := \frac{1}{n} \sum_{i=1}^{n} g \circ \pi_i. \tag{1} $$

It is on this macroscopic variable that we will focus our attention. We shall take $Y$ to be a set of real
numbers equipped with the Euclidean norm. To accommodate all possible values of $G$ as well as any
accumulation points, we shall assume $Y = [y_{\min}, y_{\max}]$, where $y_{\min} = \inf g(X)$, $y_{\max} = \sup g(X)$,
and $g(X)$ is the image of $X$ under $g$. Note that $G$ is indeed a measurable function since it is a finite
sum of measurable functions.

The dynamics of the system are described by a family of measurable transformations $\Phi_t$ on $X^n$
indexed by the time parameter $t \in T \subseteq \mathbb{R}$, where $\Phi_0$ is the identity map. The macroscopic average at
time $t$ is thus given by $G_t := G \circ \Phi_t$. We shall suppose that the particles are dynamically independent
and identical in the sense that

$$ \Phi_t = (\varphi_t \circ \pi_1, \ldots, \varphi_t \circ \pi_n), \tag{2} $$

where $\varphi_t$ is a measurable transformation on $X$. We shall further suppose that $\varphi_t$ is continuous $\mu$-a.e.
on $X$ for each $t \in T$, though not necessarily continuous on $T$ for $\mu$-a.e. $x \in X$.

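The macrostate $G_t$ may be considered a sample mean over the a priori i.i.d. random variables
$\{g \circ \varphi_t \circ \pi_i\}_{i=1}^{n}$ with common marginal $\mu \circ (g \circ \varphi_t)^{-1}$. Since $g$ is bounded, the weak law of large numbers
implies $G_t$ converges in probability to the expectation value $\int_X g \circ \varphi_t \, d\mu$ as $n \to \infty$; in other words,

$$ \lim_{n \to \infty} \mu^n[\{G_t \in B\}] = \begin{cases} 0, & \text{if } \int_X g \circ \varphi_t \, d\mu \notin \bar{B}, \\ 1, & \text{if } \int_X g \circ \varphi_t \, d\mu \in B^\circ, \end{cases} \tag{3} $$

for any Borel set $B \subseteq Y$, where $B^\circ$ is the interior of $B$ and $\bar{B}$ is its closure. No limit is specified for
points on the boundary, $\partial B$, of $B$.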
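
As a purely illustrative numerical check of Eq. (3), the following sketch draws $n$ i.i.d. initial microstates and evaluates $G_t$. Every specific here is an assumption made for the example and is not taken from the text: $X = [0, 1]$ with $\mu$ uniform, $g(x) = x$, and the hypothetical contracting dynamics $\varphi_t(x) = x e^{-t}$, for which $\int_X g \circ \varphi_t \, d\mu = e^{-t}/2$.

```python
import math
import random


def phi(t, x):
    """Hypothetical single-particle dynamics (an assumption): phi_t(x) = x * exp(-t)."""
    return x * math.exp(-t)


def g(x):
    """Hypothetical bounded observable on X = [0, 1] (an assumption): g(x) = x."""
    return x


def macroscopic_average(t, microstates):
    """G_t = (1/n) * sum_i g(phi_t(x_i)), following Eqs. (1) and (2)."""
    return sum(g(phi(t, x)) for x in microstates) / len(microstates)


def sample_initial_microstates(n, seed=0):
    """Draw n i.i.d. initial microstates from mu = Uniform[0, 1]."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


if __name__ == "__main__":
    t = 1.0
    expected = math.exp(-t) / 2  # integral of g o phi_t with respect to mu
    for n in (100, 10_000, 100_000):
        G_t = macroscopic_average(t, sample_initial_microstates(n))
        print(n, G_t, abs(G_t - expected))
```

The printed deviation from $e^{-t}/2$ should shrink roughly like $n^{-1/2}$, which is the unconditioned behavior that the conditioning discussed next disrupts.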

If the initial microstates are restricted so as to satisfy some initial macroscopic constraint, then
the variables $\{g \circ \varphi_t \circ \pi_i\}_{i=1}^{n}$ may no longer be independent due to the correlations imposed by this
conditioning. A direct application of the law of large numbers is therefore no longer valid. Suppose,
in particular, that the initial macrostate $G_0 = G$ is constrained to be in a small region $B_\delta$ containing
a given macrostate $y$. We will show that $G_t$ converges in conditional probability to some $\psi_t(y)$; in
other words,

$$ \lim_{n \to \infty} \mu^n[\{G_t \in B\} \mid \{G \in B_\delta\}] = \begin{cases} 0, & \text{if } \psi_t(y) \notin \bar{B}, \\ 1, & \text{if } \psi_t(y) \in B^\circ, \end{cases} \tag{4} $$

where $\psi_t(y)$ is the expectation value of $g \circ \varphi_t$ with respect to a new probability measure determined
by $y$.
In Section 3 we shall consider three values of $y$, namely $y_{\min}$, $y_{\max}$, and $y_* = \int_X g \, d\mu$, for which
the law of large numbers alone may be used to determine $\psi_t(y)$ and prove convergence in conditional
probability. Later, in Section 4, convergence is proven for $y \in (y_{\min}, y_{\max}) = Y^\circ$ using the theory of
large deviations, from which a general expression for $\psi_t(y)$ is obtained in terms of the expectation
of $g \circ \varphi_t$ with respect to a suitable canonical measure.



3 Law of Large Numbers Approach 

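We consider first an approach using only the law of large numbers, which will be valid for
$y = y_* := \int_X g \, d\mu$ and, in certain cases, $y = y_{\min}$ and $y = y_{\max}$. We begin with the latter two cases.

Suppose $\mu[\{g = y_{\min}\}] > 0$ and note that, since $\{G = y_{\min}\} = \{g = y_{\min}\} \times \cdots \times \{g = y_{\min}\}$,

$$ \mu^n[A_1 \times \cdots \times A_n \mid \{G = y_{\min}\}] = \prod_{i=1}^{n} \mu[A_i \mid \{g = y_{\min}\}], \tag{5} $$

for any $A_1, \ldots, A_n$ in $\mathcal{A}$. Thus, conditioned on $\{G = y_{\min}\}$, $G_t$ is a sample mean of i.i.d. random
variables with common marginal $\mu[(g \circ \varphi_t)^{-1}(\,\cdot\,) \mid \{g = y_{\min}\}]$. The weak law of large numbers then
implies

$$ \lim_{n \to \infty} \mu^n[\{G_t \in B\} \mid \{G = y_{\min}\}] = \begin{cases} 0, & \text{if } \psi_t(y_{\min}) \notin \bar{B}, \\ 1, & \text{if } \psi_t(y_{\min}) \in B^\circ, \end{cases} \tag{6} $$

for any Borel set $B \subseteq Y$, where

$$ \psi_t(y_{\min}) := \int_X g \circ \varphi_t \, d\mu[\,\cdot \mid \{g = y_{\min}\}]. \tag{7} $$

A similar result holds for conditioning on $\{G = y_{\max}\}$, provided $\mu[\{g = y_{\max}\}] > 0$.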

Let us turn now to the case $y = y_*$, where we suppose $y_* \in Y^\circ$. Given a set $B_\delta \subseteq Y$ whose
interior contains $y_*$, we wish to examine the limit

$$ \lim_{n \to \infty} \mu^n[\{G_t \in B\} \mid \{G \in B_\delta\}], \tag{8} $$

for an arbitrary Borel subset $B$ of $Y$.

To parallel the large deviation approach of the following section, we consider the convergence of
the empirical measure $L_n$, defined by

$$ L_n(x_1, \ldots, x_n) := \frac{1}{n} \sum_{i=1}^{n} \delta_{x_i}, \tag{9} $$

where $\delta_{x_i}$ is the unit point measure on $x_i$. Given $(x_1, \ldots, x_n) \in X^n$ and $C \subseteq X$ measurable,
$L_n(x_1, \ldots, x_n) \in \mathcal{P}(X)$, and $L_n(x_1, \ldots, x_n)[C]$ is the fraction of particles whose microstate is in $C$.
A useful result is that $G_t$, and hence $G$, may be written in terms of $L_n$, since

$$ G_t(x_1, \ldots, x_n) = \frac{1}{n} \sum_{i=1}^{n} g(\varphi_t(x_i)) = \int_X g \circ \varphi_t \, dL_n(x_1, \ldots, x_n). \tag{10} $$

Using the above relation, convergence properties of $G_t$ may then be deduced from those of $L_n$.
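
The identity in Eq. (10) can be checked directly on a finite state space. The sketch below is only an illustration under stated assumptions: the state space $X = \{0, \ldots, 4\}$, the observable `g`, and the dynamics `phi` are hypothetical choices, and exact rational arithmetic is used so that both sides of Eq. (10) can be compared for equality.

```python
from collections import Counter
from fractions import Fraction


def empirical_measure(microstates):
    """L_n as a dict mapping each state to the fraction of particles in it, cf. Eq. (9)."""
    n = len(microstates)
    return {x: Fraction(k, n) for x, k in Counter(microstates).items()}


def expectation(f, measure):
    """E_f(P) = integral of f dP for a finitely supported measure P."""
    return sum(f(x) * p for x, p in measure.items())


def macroscopic_average(f, microstates):
    """(1/n) * sum_i f(x_i), computed exactly."""
    return sum(Fraction(f(x)) for x in microstates) / len(microstates)


def g(x):
    """Hypothetical observable on X = {0, ..., 4} (an assumption)."""
    return x * x


def phi(t, x):
    """Hypothetical dynamics on X = {0, ..., 4} (an assumption): cyclic shift by t."""
    return (x + t) % 5


if __name__ == "__main__":
    microstates = [0, 1, 1, 3, 4, 4, 4, 2]
    L_n = empirical_measure(microstates)
    for t in range(5):
        h = lambda x, t=t: g(phi(t, x))
        # Left and right sides of Eq. (10) agree exactly.
        print(t, macroscopic_average(h, microstates), expectation(h, L_n))
```

Because $G_t$ depends on the microstate only through $L_n$, any statement about where $L_n$ concentrates translates into a statement about $G_t$, which is how the argument proceeds.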
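Now, since $(X, d)$ is a separable space, $L_n$ converges almost surely to $\mu$ [?, p. 313] and hence in
probability as well. Thus, for any Borel set $A \subseteq \mathcal{P}(X)$,

$$ \lim_{n \to \infty} \mu^n[\{L_n \in A\}] = \begin{cases} 0, & \text{if } \mu \notin \bar{A}, \\ 1, & \text{if } \mu \in A^\circ. \end{cases} \tag{11} $$

Now suppose $A_\delta$ is a Borel subset of $\mathcal{P}(X)$ such that $\mu \in A_\delta^\circ$. Using Eq. (11), we have that

$$ \lim_{n \to \infty} \mu^n[\{L_n \in A\} \mid \{L_n \in A_\delta\}] = \begin{cases} 0, & \text{if } \mu \notin \bar{A}, \\ 1, & \text{if } \mu \in A^\circ, \end{cases} \tag{12} $$

since $\mu^n[\{L_n \in A_\delta\}] \to 1$ as $n \to \infty$.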

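To apply this result to the convergence of $G_t$, we define the expectation function
$E_g(P) := \int_X g \, dP$ for $P \in \mathcal{P}(X)$ and note that

$$ E_g \circ L_n = \frac{1}{n} \sum_{i=1}^{n} g \circ \pi_i = G. \tag{13} $$

Since $g$ is bounded and continuous $\mu$-a.e., $E_g$ is continuous at $\mu$ in the Prohorov metric topology
[?]. (Of course, if $P \ll \mu$, then $E_g$ is continuous at $P$ as well.) Thus, $E_g(\mu) \in B^\circ$ implies
$\mu \in E_g^{-1}(B)^\circ$, and similarly $E_g(\mu) \notin \bar{B}$ implies $\mu \notin \overline{E_g^{-1}(B)}$.

Now, since $g \circ \varphi_t$ is also bounded and continuous $\mu$-a.e., $\psi_t(y_*) = E_{g \circ \varphi_t}(\mu) \in B^\circ$ implies
$\mu \in E_{g \circ \varphi_t}^{-1}(B)^\circ$, and similarly $\psi_t(y_*) \notin \bar{B}$ implies $\mu \notin \overline{E_{g \circ \varphi_t}^{-1}(B)}$. Setting $A = E_{g \circ \varphi_t}^{-1}(B)$ and
$A_\delta = E_g^{-1}(B_\delta)$ in Eq. (12), we conclude that

$$ \lim_{n \to \infty} \mu^n[\{G_t \in B\} \mid \{G \in B_\delta\}] = \begin{cases} 0, & \text{if } \psi_t(y_*) \notin \bar{B}, \\ 1, & \text{if } \psi_t(y_*) \in B^\circ, \end{cases} \tag{14} $$

where

$$ \psi_t(y_*) := E_{g \circ \varphi_t}(\mu) = \int_X g \circ \varphi_t \, d\mu. \tag{15} $$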



If $\mu$ is an invariant measure, i.e. $\mu \circ \varphi_t^{-1} = \mu$, we obtain the rather trivial result that $G_t$ converges
in probability to its equilibrium value, $y_*$, as $n \to \infty$.

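The general problem, wherein $y \in Y^\circ \setminus \{y_*\}$, cannot be addressed with the law of large numbers
alone. To see this, consider a set $B_\delta \subseteq Y^\circ$ such that $y_* \notin \bar{B}_\delta$ and let $B \subseteq Y$ be some Borel set. We
wish to evaluate the following conditional probability as $n \to \infty$:

$$ \mu^n[\{G_t \in B\} \mid \{G \in B_\delta\}] = \mu^n[\{L_n \in E_{g \circ \varphi_t}^{-1}(B)\} \mid \{L_n \in E_g^{-1}(B_\delta)\}] = \frac{\mu^n[\{L_n \in E_{g \circ \varphi_t}^{-1}(B) \cap E_g^{-1}(B_\delta)\}]}{\mu^n[\{L_n \in E_g^{-1}(B_\delta)\}]}. $$

Since $y_* = E_g(\mu) \notin \bar{B}_\delta$, $\mu \notin \overline{E_g^{-1}(B_\delta)}$ and the denominator goes to zero. Since
$\mu \notin \overline{E_{g \circ \varphi_t}^{-1}(B)} \cap \overline{E_g^{-1}(B_\delta)} \supseteq \overline{E_{g \circ \varphi_t}^{-1}(B) \cap E_g^{-1}(B_\delta)}$, we see that the numerator goes to zero as well, leaving the limit
indeterminate. To proceed further requires more detailed information about the rate of convergence
of each limit, which the law of large numbers alone cannot provide. For this we turn to the theory
of large deviations.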
4 Large Deviation Theory Approach



We have seen in the previous section that the weak law of large numbers is insufficient for evaluating
limiting conditional probabilities when the initial macrostate is not $y_*$. (More precisely, this is true
when we condition on a set of macrostates whose closure does not contain $y_*$.) Large deviation theory
provides a tool for evaluating such limits and, through its application, provides an explicit expression
for $\psi_t(y)$ even when $y \neq y_*$. This approach is a refinement of the weak law of large numbers for
cases in which the probabilities converge exponentially fast at a rate given in terms of a function
$I$, the so-called "rate function." The groundwork for this theory was established by Boltzmann
[?] in his study of the asymptotic properties of multinomials and later applied by Einstein in
his analysis of fluctuations. Recent years have seen great development of this relatively new field
of mathematical probability, including applications in equilibrium statistical mechanics, stochastic
processes, and mathematical statistics [?]. Ruelle and Lanford [?] have developed
similar techniques for studying the equilibrium distributions of dynamical maps, where the (negative)
Kolmogorov-Sinai entropy serves as a rate function [?].



In Section 4.1 we define rate functions and the large deviation principle. The main result is
Theorem 1 regarding convergence in probability for conditional probabilities. In Section 4.2 we take
up the notion of Gibbs conditioning, which will allow us to apply Theorem 1 to cases in which
we condition on an arbitrary initial macrostate $y$. Our results are similar to those of [?], who use
the stronger $\tau$-topology on $\mathcal{P}(X)$, and follow from an approach adapted from [?], who considers
discrete macrostates.



4.1 Large Deviation Theory 

For a topological space $(X, \mathcal{T})$, a function $I$ mapping $X$ into $[0, \infty]$ is called a rate function iff
$\inf I(X) = 0$ and $I$ is lower semicontinuous, i.e. the preimage $I^{-1}([0, a])$ is closed for all $a \in [0, \infty)$.
A good rate function is one for which these preimages are compact. The property of being lower
semicontinuous guarantees that $I$ attains its infimum on any compact set, from which it follows that
a good rate function attains its infimum over any closed set [?, pp. 4, 308]. Any point, $x_*$, at which
$I(x_*) = 0$ is called an equilibrium point, since it corresponds to a state of maximum probability.

A sequence $(P_n)_{n \in \mathbb{N}}$ of probability measures on the Borel subsets of $X$ is said to satisfy a large
deviation principle with rate function $I$ iff there exists a sequence $(a_n)_{n \in \mathbb{N}}$ of positive numbers
tending to infinity such that, for any Borel set $A \subseteq X$,

$$ -\inf I(A^\circ) \le \liminf_{n \to \infty} \frac{1}{a_n} \log P_n[A] \le \limsup_{n \to \infty} \frac{1}{a_n} \log P_n[A] \le -\inf I(\bar{A}). \tag{16} $$

A set $A$ for which $\inf I(A^\circ) = \inf I(\bar{A})$ is called an $I$-continuity set. Clearly for such sets we have

$$ \lim_{n \to \infty} \frac{1}{a_n} \log P_n[A] = -\inf I(A). \tag{17} $$

This result should be compared with the Boltzmann relation, $S = k_B \log W$, relating the thermodynamic
entropy, $S$, to a volume, $W$, in phase space, where $k_B$ is Boltzmann's constant. The quantity
$-a_n I$ serves as a negative entropy, while $P_n$ may be viewed as a normalized volume measure.

If $0 < \inf I(\bar{A}) < \infty$, Eq. (16) implies that $P_n[A]$ converges to zero at least exponentially fast.
Specifically, from Eq. (16) we may deduce the following: Given any Borel set $A \subseteq X$ and $\epsilon > 0$,
then for all $n$ sufficiently large,

$$ \exp[-a_n (1 + \epsilon) \inf I(A^\circ)] \le P_n[A] \le \exp[-a_n (1 - \epsilon) \inf I(\bar{A})], \tag{18} $$

provided $\inf I(A^\circ) > 0$ and $\inf I(\bar{A}) < \infty$. Equality for the lower bound may hold only if $\inf I(A^\circ) = \infty$,
where $e^{-\infty} := 0$, while equality for the upper bound may hold only if $\inf I(\bar{A}) = 0$.

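If, for example, there is a unique equilibrium point $x_*$, then $x_* \notin \bar{A}$ implies $\inf I(\bar{A}) > 0$. Provided
$\inf I(\bar{A}) < \infty$ and taking $\epsilon = 1/2$, say, this implies $P_n[A] \le \exp[-a_n \inf I(\bar{A})/2]$ for all $n$ sufficiently
large. If $\inf I(\bar{A}) = \infty$, then $A$ is an $I$-continuity set and Eq. (17) implies $a_n^{-1} \log P_n[A] \to -\infty$.
Thus, if $x_* \notin \bar{A}$, then $P_n[A] \to 0$ as $n \to \infty$, and we recover the weak law of large numbers, i.e.

$$ \lim_{n \to \infty} P_n[A] = \begin{cases} 0, & \text{if } x_* \notin \bar{A}, \\ 1, & \text{if } x_* \in A^\circ, \end{cases} \tag{19} $$

for any Borel set $A \subseteq X$.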
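
A familiar special case may help make Eqs. (16) to (18) concrete. The sketch below is an illustration that is not drawn from the text: it takes $P_n$ to be the law of the sample mean of $n$ fair coin flips, $a_n = n$, and the set $A = [0.7, 1]$, for which the rate function is the standard Cramér (relative entropy) function and $\inf I(A) = I(0.7)$. The tail probability is computed exactly with integer arithmetic.

```python
import math


def rate_function(a, p):
    """Cramer rate function of a Bernoulli(p) sample mean, for 0 < a < 1."""
    return a * math.log(a / p) + (1 - a) * math.log((1 - a) / (1 - p))


def log_tail_probability(n, k_min):
    """log P[S_n >= k_min] for S_n ~ Binomial(n, 1/2), using exact integer sums."""
    count = sum(math.comb(n, k) for k in range(k_min, n + 1))
    return math.log(count) - n * math.log(2)


if __name__ == "__main__":
    p = 0.5
    for n in (10, 100, 1000):
        k_min = (7 * n) // 10  # the event {sample mean >= 0.7}
        estimate = -log_tail_probability(n, k_min) / n
        # The estimate approaches I(0.7) from above as n grows, cf. Eq. (17).
        print(n, estimate, rate_function(k_min / n, p))
```

The gap between the two printed columns decays like $(\log n)/n$, which is invisible at the exponential scale that the large deviation principle describes.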

For our purposes, we would like to show a result similar to Eq. (19) for the case in which we
condition on a suitable initial condition. In particular, we would like to obtain a result analogous
to Eq. (12). The following theorem, stated in its general form, may be used for this purpose. The
proof is deferred to the appendix.

Theorem 1 Suppose $(P_n)_{n \in \mathbb{N}}$ satisfies a large deviation principle with a good rate function $I$ and
a unique equilibrium point $x_*$. Let $B \subseteq X$ be an $I$-continuity set such that $x_* \notin \partial B$ and there exists
a unique $x_B \in \bar{B}$ such that $I(x_B) = \inf I(B) < \infty$. Given any Borel set $A \subseteq X$,

$$ \lim_{n \to \infty} P_n[A \mid B] = \begin{cases} 0, & \text{if } x_B \notin \bar{A}, \\ 1, & \text{if } x_B \in A^\circ. \end{cases} \tag{20} $$

In the next section, we shall consider a large deviation principle for the sequence $(\mu^n \circ L_n^{-1})_{n \in \mathbb{N}}$
of distributions of empirical measures. Conditioning on $A_\delta = E_g^{-1}(B_\delta)$ will then give rise to a new
equilibrium probability measure, $P_\lambda$, in general different from $\mu$. We shall show that under these
conditions Theorem 1 is satisfied, thus establishing convergence of $L_n$ in conditional probability to
$P_\lambda$.



4.2 Gibbs Conditioning 

In his 1877 paper, Boltzmann proved that the asymptotically most probable configuration for a gas
of $n$ particles with a finite number of macrostates is given by a multinomial distribution. Sanov's
theorem [?, p. 70], a modern refinement of this classic result, states that the sequence $(\mu^n \circ L_n^{-1})_{n \in \mathbb{N}}$
of distributions of empirical measures satisfies a large deviation principle with rate function
$I_\mu : \mathcal{P}(X) \to [0, \infty]$. Here $I_\mu(P)$ is the (negative) Gibbs entropy of $P$ with respect to $\mu$, defined by

$$ I_\mu(P) := \begin{cases} \int_X \frac{dP}{d\mu} \log \frac{dP}{d\mu} \, d\mu, & \text{if } P \ll \mu, \\ \infty, & \text{otherwise}, \end{cases} \tag{21} $$

where $0 \log 0 := 0$. It can be shown that $I_\mu$ is a good, strictly convex rate function [?, p. 240]
which attains its infimum uniquely at $\mu$ [?, pp. 32-34].

Given $y \in Y^\circ$ and $\delta > 0$, let $A_\delta = E_g^{-1}(B_\delta)$, where $B_\delta = (y - \delta, y]$ when $y < y_*$, $B_\delta = [y, y + \delta)$
when $y > y_*$, and $B_\delta = (y_* - \delta, y_* + \delta)$ when $y = y_*$. (Note that $\mu \notin \partial A_\delta$, since $y_* \notin \partial B_\delta$ and
$E_g$ is continuous at $\mu$.) We wish to consider the asymptotic behavior of the conditional probability
$(\mu^n \circ L_n^{-1})[A \mid A_\delta] = \mu^n[\{L_n \in A\} \mid \{L_n \in A_\delta\}]$, as $n \to \infty$, for any Borel set $A \subseteq \mathcal{P}(X)$. For example,
if $G$ is the macroscopic average energy of the system, then $\mu^n[\,\cdot \mid \{G \in B_\delta\}]$ is the microcanonical
distribution on the "thickened" energy shell with energy $y$.

We will show that, conditioned on $\{G \in B_\delta\}$, the empirical measures $(L_n)_{n \in \mathbb{N}}$ converge in
probability to the canonical Gibbs measure $P_\lambda$, where $\lambda$ satisfies the constraint $y = \int_X g \, dP_\lambda$ and

$$ dP_\lambda(x) := \frac{e^{\lambda g(x)}}{Z(\lambda)} \, d\mu(x). \tag{22} $$

The normalization factor $Z(\lambda) := \int_X e^{\lambda g} \, d\mu$ is the partition function, and the quantity
$\Psi(\lambda) := \log Z(\lambda)$ is the generalized free energy. Note that, if $G$ is the macroscopic average energy, then
$-k_B T \, \Psi(-1/(k_B T))$ is the familiar Helmholtz free energy at temperature $T$.

Denote by $\theta$ the map $\theta : [-\infty, +\infty] \to Y$ which associates a given $\lambda$ with a certain value of $y$
and is defined by $\theta(\lambda) := \int_X g \, dP_\lambda$ for $\lambda \in \mathbb{R}$, $\theta(-\infty) := y_{\min}$, and $\theta(+\infty) := y_{\max}$. The following
lemmas will be needed. The proofs are deferred to the appendix.

Lemma 1 If $\Psi''(\lambda)$ is nonzero for all $\lambda \in (-\infty, +\infty)$, then the map $\theta$ is invertible. Furthermore,
$y < y_*$ iff $\lambda < 0$, $y = y_*$ iff $\lambda = 0$, and $y > y_*$ iff $\lambda > 0$.
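
For a finite state space the canonical measure of Eq. (22) and the map $\theta$ can be computed directly. The sketch below is illustrative only: the three-point measure `mu` and the observable values `g` are hypothetical, and the inversion of $\theta$ by bisection is a convenience of this example rather than a method taken from the text. It relies on the standard identities $\Psi'(\lambda) = \theta(\lambda)$ and $\Psi''(\lambda) = \mathrm{Var}_{P_\lambda}(g) \ge 0$, so that $\theta$ is increasing whenever $\Psi''$ is nonzero.

```python
import math


def gibbs_measure(mu, g, lam):
    """Canonical measure P_lambda of Eq. (22) on a finite state space."""
    weights = [m * math.exp(lam * gx) for m, gx in zip(mu, g)]
    Z = sum(weights)  # partition function Z(lambda)
    return [w / Z for w in weights]


def theta(mu, g, lam):
    """theta(lambda) = integral of g dP_lambda."""
    return sum(p * gx for p, gx in zip(gibbs_measure(mu, g, lam), g))


def theta_inverse(mu, g, y, lo=-60.0, hi=60.0, iterations=200):
    """Solve theta(lambda) = y by bisection, assuming theta is increasing."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if theta(mu, g, mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    mu = [0.5, 0.3, 0.2]  # hypothetical a priori measure on three states
    g = [0.0, 1.0, 3.0]   # hypothetical observable values, so Y = [0, 3]
    y_star = theta(mu, g, 0.0)  # P_0 = mu, hence theta(0) = y_*
    for y in (0.5, y_star, 2.0):
        lam = theta_inverse(mu, g, y)
        print(y, lam, gibbs_measure(mu, g, lam))
```

In this example the sign of the computed $\lambda$ follows the sign of $y - y_*$, in line with Lemma 1.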



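Lemma 2 Given $y \in Y^\circ$, let $A_\delta = E_g^{-1}(B_\delta)$ and $\lambda = \theta^{-1}(y)$. Then $I_\mu(P_\lambda) = \inf I_\mu(\bar{A}_\delta) < \infty$ and
$I_\mu(P_\lambda) < I_\mu(P)$ for all $P \in \bar{A}_\delta \setminus \{P_\lambda\}$.

Lemma 3 Given $y \in Y^\circ$, the set $A_\delta = E_g^{-1}(B_\delta)$ is an $I_\mu$-continuity set.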

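Using the above lemmas we may deduce that, given $y \in Y^\circ$, $A_\delta = E_g^{-1}(B_\delta)$ is such that $\mu \notin \partial A_\delta$
and, for $\lambda = \theta^{-1}(y)$, $P_\lambda$ is the unique measure in $\bar{A}_\delta$ such that $I_\mu(P_\lambda) = \inf I_\mu(A_\delta^\circ) = \inf I_\mu(\bar{A}_\delta) < \infty$.
By Theorem 1 we conclude that for any Borel set $A \subseteq \mathcal{P}(X)$,

$$ \lim_{n \to \infty} \mu^n[\{L_n \in A\} \mid \{L_n \in A_\delta\}] = \begin{cases} 0, & \text{if } P_\lambda \notin \bar{A}, \\ 1, & \text{if } P_\lambda \in A^\circ. \end{cases} \tag{23} $$

In particular, for any Borel set $B \subseteq Y$,

$$ \lim_{n \to \infty} \mu^n[\{G_t \in B\} \mid \{G \in B_\delta\}] = \begin{cases} 0, & \text{if } \psi_t(y) \notin \bar{B}, \\ 1, & \text{if } \psi_t(y) \in B^\circ, \end{cases} \tag{24} $$

where

$$ \psi_t(y) := \int_X g \circ \varphi_t \, dP_{\theta^{-1}(y)}. \tag{25} $$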

Note that $\psi_0(y) = y$ since $\varphi_0$ is the identity on $X$. Conditioning on $\{G_{t_0} \in B_\delta\}$ merely shifts the
time axis, in which case $G_t$ converges to $\psi_{t - t_0}(y)$ in probability.
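
The conditional limit in Eq. (23) can be examined exactly for a small finite example, since the empirical measure is then determined by multinomial occupation numbers. The sketch below is an illustration under assumptions that are not part of the text: $X = \{0, 1, 2\}$, $\mu$ uniform, $g(x) = x$ (so $y_* = 1$), $y = 1.4$, and $\delta = 0.2$. It compares the conditional mean of $L_n$ given $\{G \in [y, y + \delta)\}$ with the canonical measure $P_\lambda$, $\lambda = \theta^{-1}(y)$.

```python
import math

G_VALUES = (0, 1, 2)        # hypothetical observable g(x) = x on X = {0, 1, 2}
MU = (1 / 3, 1 / 3, 1 / 3)  # hypothetical a priori measure (uniform)


def gibbs_measure(lam):
    """Canonical measure P_lambda of Eq. (22) for the example above."""
    weights = [m * math.exp(lam * gx) for m, gx in zip(MU, G_VALUES)]
    Z = sum(weights)
    return [w / Z for w in weights]


def solve_lambda(y, lo=-40.0, hi=40.0):
    """Bisection for the lambda whose canonical mean of g equals y."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        mean = sum(p * gx for p, gx in zip(gibbs_measure(mid), G_VALUES))
        lo, hi = (mid, hi) if mean < y else (lo, mid)
    return 0.5 * (lo + hi)


def conditional_mean_empirical_measure(n, s_lo, s_hi):
    """E[L_n | s_lo <= n*G < s_hi], summed exactly over occupation numbers."""
    log_mu = [math.log(m) for m in MU]
    terms = []
    for k1 in range(n + 1):
        for k2 in range(n + 1 - k1):
            if s_lo <= k1 + 2 * k2 < s_hi:  # n*G = k1 + 2*k2 for g(x) = x
                ks = (n - k1 - k2, k1, k2)
                log_w = math.lgamma(n + 1) + sum(
                    k * lm - math.lgamma(k + 1) for k, lm in zip(ks, log_mu))
                terms.append((log_w, ks))
    top = max(log_w for log_w, _ in terms)  # rescale to avoid underflow
    total, acc = 0.0, [0.0, 0.0, 0.0]
    for log_w, ks in terms:
        w = math.exp(log_w - top)
        total += w
        for i, k in enumerate(ks):
            acc[i] += w * k / n
    return [a / total for a in acc]


if __name__ == "__main__":
    print("P_lambda:", gibbs_measure(solve_lambda(1.4)))
    for n in (15, 60, 300):
        # B_delta = [1.4, 1.6), expressed through the integer sum n*G
        print(n, conditional_mean_empirical_measure(n, 7 * n // 5, 8 * n // 5))
```

Since $L_n$ takes values in a bounded set, convergence in conditional probability to $P_\lambda$ implies convergence of the conditional mean, which is the quantity printed here.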
5 The Deterministic Curve



We have shown that, conditioned on $G \in B_\delta$, the macrostate $G_t$ at a given time, $t$, converges in
probability to the expectation value $\psi_t(y)$, where $y$ is contained in $B_\delta$. Hence, of the initial
microstates consistent with the initial macrostate, $y$, "most" will be such that the actual macrostate
realized will be near this value. Now, each microstate, $(x_1, \ldots, x_n)$, gives rise to a collection,
$\{G_t(x_1, \ldots, x_n) : t \in T\}$, of macrostates constituting a single trajectory. Likewise, each macrostate,
$y$, gives rise to a collection, $\{\psi_t(y) : t \in T\}$, of expected macrostates, which we shall call the
deterministic curve. This graph represents the asymptotically deterministic behavior of the macrostate.
In this section we shall investigate in what sense the deterministic curve is representative of a typical
trajectory and consider several properties of the deterministic curve itself, considered as a function
of time.

In general, there may be striking qualitative differences between the deterministic curve and
a particular macrostate trajectory. Suppose $\mu$ is invariant under $\varphi_t$ and $\mu^n[G^{-1}(B_\delta)] > 0$. The
Poincaré recurrence theorem tells us that for some unbounded sequence $\tau_1, \tau_2, \ldots$ of times

$$ \mu^n[\{G_{\tau_1} \in B_\delta, G_{\tau_2} \in B_\delta, \ldots\} \mid \{G \in B_\delta\}] = 1, \tag{26} $$

i.e., the macrostate returns infinitely often to a neighborhood of its initial value, $y$, almost surely.
Now suppose the map $t \mapsto \psi_t(y)$ has an attracting set $A$ with domain of attraction $D$ and take
$B_\delta \subseteq D \setminus U$, where $U$ is a neighborhood of $A$. For all $t$ sufficiently large, $\psi_t(y)$ will be in $U$, but
$G_t$ will almost surely fall outside $U$ on the recurrence times $\tau_1, \tau_2, \ldots$. These recurrence times will
depend upon $B_\delta$ and $\varphi_t$, of course, but typically increase rapidly with $n$. On a time scale small
compared to $\tau_1$, one then expects the deterministic curve to be quite representative of a typical
trajectory. On time scales larger than $\tau_1$, however, the deterministic curve will be qualitatively
quite different from a typical trajectory; the former converges to an attracting set while the latter
exhibits quasi-periodic behavior. This highlights the importance of a clearer understanding of the
correspondence between the very distinct $n < \infty$ and $n = \infty$ cases.



5.1 Convergence to the Deterministic Curve 



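Consider a finite set, $\{t_1, \ldots, t_m\}$, of times and for each time $t_i$ let $B_i \subseteq Y$ be a Borel set. The set
of microstates in which $G_{t_i} \in B_i$ for all $i$ is given by

$$ \bigcap_{i=1}^{m} \{G_{t_i} \in B_i\} = \Big\{ L_n \in \bigcap_{i=1}^{m} A_i \Big\}, $$

where $A_i = E_{g \circ \varphi_{t_i}}^{-1}(B_i)$. According to Eq. (23), with $A = \bigcap_{i=1}^{m} A_i$, the limiting conditional
probability will be zero if $P_\lambda \notin \overline{\bigcap_{i=1}^{m} A_i}$, where $\lambda = \theta^{-1}(y)$, and it will be one if $P_\lambda \in (\bigcap_{i=1}^{m} A_i)^\circ$. Since
the intersection is over a finite number of sets, we may use the fact that $(\bigcap_{i=1}^{m} A_i)^\circ = \bigcap_{i=1}^{m} A_i^\circ$ and
$\overline{\bigcap_{i=1}^{m} A_i} \subseteq \bigcap_{i=1}^{m} \bar{A}_i$ to obtain the following:

$$ \lim_{n \to \infty} \mu^n\Big[ \bigcap_{i=1}^{m} \{G_{t_i} \in B_i\} \,\Big|\, \{G \in B_\delta\} \Big] = \begin{cases} 0, & \text{if } \psi_{t_j}(y) \notin \bar{B}_j \text{ for some } j, \\ 1, & \text{if } \psi_{t_i}(y) \in B_i^\circ \text{ for all } i. \end{cases} \tag{27} $$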



For a countably infinite set of times, we may again conclude that the conditional probability goes
to zero if $\psi_{t_j}(y) \notin \bar{B}_j$ for some value of $j$, since $\bigcap_{i=1}^{\infty} \{G_{t_i} \in B_i\} \subseteq \{G_{t_j} \in B_j\}$. However, even if
$\psi_{t_i}(y) \in B_i^\circ$ for all $i$ and hence $P_\lambda \in \bigcap_{i=1}^{\infty} A_i^\circ$, this does not assure us that $P_\lambda \in (\bigcap_{i=1}^{\infty} A_i)^\circ$. For
the latter to be true, there must be a neighborhood of $P_\lambda$ which is sufficiently small so that it is
contained in every $A_i^\circ$. Let us consider, then, conditions for which this is true.

Suppose each $B_i$ is an open interval of radius $\delta'$ centered at $\psi_{t_i}(y)$. Now, each corresponding
$A_i$ is an open set of probability measures which give an expectation of $g \circ \varphi_{t_i}$ in the open ball $B_i$.
Loosely speaking, if $g \circ \varphi_{t_i}$, and hence $E_{g \circ \varphi_{t_i}}$, varies too rapidly, then $A_i$ will be small as a result.
If, therefore, the functions $\{g \circ \varphi_{t_i}\}_{i \in \mathbb{N}}$ are somehow limited in how rapidly they may vary, then one
might expect the sizes of $A_1, A_2, \ldots$ to be bounded from below, thus giving a nonempty interior for
their intersection. For a general metric space, $(X, d)$, one way to characterize how rapidly a function,
$f$, may vary is by its Lipschitz norm, $\|f\|_L$, defined as follows:

$$ \|f\|_L := \sup_{x \neq x'} \frac{|f(x) - f(x')|}{d(x, x')}. \tag{28} $$

Differentiable functions will be Lipschitz if they have bounded derivatives; discontinuous functions,
such as indicator functions, have infinite Lipschitz norm provided $d(x, x')$ may be made arbitrarily
small.

To ensure well-defined expectations we require the functions to be bounded as well, so consider
instead the bounded Lipschitz norm, $\| \cdot \|_{BL}$, defined simply by $\|f\|_{BL} := \|f\|_L + \|f\|_\infty$. (For a
discussion of this norm, see Dudley [?].) A set, $\{f_i\}_{i \in \mathbb{N}}$, of functions will be called uniformly bounded
Lipschitz if there exists a number, $K$, such that $\|f_i\|_{BL} \le K$ for all $i \in \mathbb{N}$. The following theorem
states that for such observables the macroscopic trajectories converge in conditional probability to
the deterministic curve on any countable set of times.
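
To get a feel for the uniform bounded Lipschitz condition, the norms in Eq. (28) and in the definition of $\| \cdot \|_{BL}$ can be estimated on a grid. The sketch below is only an illustration: the interval $[0, 2\pi]$ with the usual distance and the two families of test functions are hypothetical choices. The family $\sin(x + i)$ has bounded Lipschitz norm at most 2 for every $i$, whereas for $\sin(2^i x)$ the norm grows like $2^i + 1$, so no single bound $K$ exists.

```python
import math


def lipschitz_norm_on_grid(f, a, b, num_points=10_001):
    """Largest difference quotient of f over a uniform grid on [a, b].

    On a one-dimensional grid every secant slope is a weighted average of the
    slopes between adjacent grid points, so adjacent points suffice.
    """
    h = (b - a) / (num_points - 1)
    xs = [a + j * h for j in range(num_points)]
    return max(abs(f(x1) - f(x0)) / h for x0, x1 in zip(xs, xs[1:]))


def bounded_lipschitz_norm_on_grid(f, a, b, num_points=10_001):
    """Grid estimate of ||f||_BL = ||f||_L + ||f||_inf."""
    h = (b - a) / (num_points - 1)
    sup_norm = max(abs(f(a + j * h)) for j in range(num_points))
    return lipschitz_norm_on_grid(f, a, b, num_points) + sup_norm


if __name__ == "__main__":
    a, b = 0.0, 2 * math.pi
    for i in range(5):
        shifted = bounded_lipschitz_norm_on_grid(lambda x, i=i: math.sin(x + i), a, b)
        dilated = bounded_lipschitz_norm_on_grid(lambda x, i=i: math.sin(2 ** i * x), a, b)
        # The first column stays near 2; the second grows with i.
        print(i, shifted, dilated)
```

A grid estimate of this kind is a lower bound on the true norm, so it can reveal that a family is not uniformly bounded Lipschitz but cannot by itself prove that it is.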

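Theorem 2 If $\{g \circ \varphi_{t_i}\}_{i \in \mathbb{N}}$ are uniformly bounded Lipschitz functions, then for any $t_1, t_2, \ldots$,

$$ \lim_{n \to \infty} \mu^n[\{|G_{t_i} - \psi_{t_i}(y)| < \delta', \ \forall i \in \mathbb{N}\} \mid \{G \in B_\delta\}] = 1. \tag{29} $$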

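As noted above, this result may be proven by finding an open ball, $B_\rho(P_\lambda, \epsilon)$, about $P_\lambda$ which is
contained in every $A_i$. Consider an arbitrary probability measure, $P$, in $B_\rho(P_\lambda, \epsilon)$ and observe that,
since we have assumed $\|g \circ \varphi_{t_i}\|_{BL} \le K$,

$$ |E_{g \circ \varphi_{t_i}}(P) - \psi_{t_i}(y)| \le \sup\{|E_f(P) - E_f(P_\lambda)| : \|f\|_{BL} \le K\} = K \sup\{|E_f(P) - E_f(P_\lambda)| : \|f\|_{BL} \le 1\} \le 2K \rho(P, P_\lambda) < 2K\epsilon, $$

where the inequality in terms of $\rho$ follows from [?, pp. 310, 322], since $(X, d)$ is a separable metric space.
Taking $\epsilon < \delta'/(2K)$ shows that $E_{g \circ \varphi_{t_i}}(P) \in B_i$ for all $P \in B_\rho(P_\lambda, \epsilon)$ and hence that $B_\rho(P_\lambda, \epsilon) \subseteq A_i$.
Since $\epsilon$ is independent of $i$, we conclude that $P_\lambda \in B_\rho(P_\lambda, \epsilon) \subseteq (\bigcap_{i=1}^{\infty} A_i)^\circ$.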

For discrete time maps with suitable observables, Theorem 2 shows that the macrostate trajectories
converge in conditional probability everywhere to the deterministic curve. This result may
seem surprising, since we have seen that the long-time behavior of a typical trajectory may differ
radically from that of the deterministic curve. However, this simply means that the time scale on
which the macroscopic behavior appears deterministic grows rapidly with the number of particles.

The case in which the set of times, $T$, takes on a continuum of values is somewhat more problematic
owing to the fact that an arbitrary intersection of measurable sets need not be measurable. If,
however, $\{G_t : t \in T\}$ is sample continuous on an interval, $T$, i.e. if $t \mapsto G_t(x_1, \ldots, x_n)$ is continuous
for every $(x_1, \ldots, x_n)$, then we may consider $\{G_t : t \in T\}$ to be a random process on the metric space
of bounded continuous functions with the supremum norm. For physical observables this is quite
reasonable to suppose. Much as in the theory of Brownian motion, we may then consider events of
the form $\{\sup_{t \in T} |G_t - \psi_t(y)| > \delta'\}$ and their corresponding conditional probabilities as $n \to \infty$. It
may then be possible to derive a large deviation principle on the set of bounded continuous functions
on $T$, much as is done for Brownian motion in Schilder's theorem [?], but here we do not pursue
this matter further.

5.2 Properties of the Deterministic Curve 

In the previous section we considered the probabilistic convergence of macrostate trajectories to the 
asymptotic deterministic curve. In this section we consider properties of the deterministic curve, 
$\{\psi_t(y) : t \in T\}$, as a function of $t$ in its own right. Of course, we do not expect these properties
to necessarily carry over to those of typical trajectories. Nevertheless, they do give a clue to the
behavior of these trajectories for large $n$ and relatively small $t$.

If $g$ is a discontinuous function, then specific realizations of $\{G_t : t \in T\}$ will also be discontinuous.
Since $G$ is the average of a bounded function, however, the size of these discontinuities will vanish
as $n \to \infty$. It is then reasonable to suppose that the deterministic curve, $\{\psi_t(y) : t \in T\}$, will be
continuous in $t$.

Recall that $g$ and $\varphi_t$ are continuous $\mu$-almost everywhere. Since $g$ is bounded $\mu$-a.e., clearly
$g \circ \varphi_t$ is so as well. If we suppose $t \mapsto g(\varphi_t(x))$ is continuous for $\mu$-a.e. $x \in X$, then of course
$g(\varphi_{t'}(x)) \to g(\varphi_t(x))$ as $t' \to t$ for $\mu$-a.e. $x \in X$. By Eq. (25) and Lebesgue dominated convergence,
this implies

$$ \lim_{t' \to t} \psi_{t'}(y) = \lim_{t' \to t} \int_X g \circ \varphi_{t'} \, dP_\lambda = \int_X g \circ \varphi_t \, dP_\lambda = \psi_t(y), \tag{30} $$

where $\lambda = \theta^{-1}(y)$. Thus, $t \mapsto \psi_t(y)$ will be continuous provided $y \in Y^\circ$ and $t \mapsto g(\varphi_t(x))$ is
continuous for $\mu$-a.e. $x \in X$.

Now suppose only that $t \mapsto \varphi_t(x)$ is continuous for $\mu$-a.e. $x \in X$ and that $\varphi_t$ is $\mu$-nonsingular,
i.e. $\mu \circ \varphi_t^{-1} \ll \mu$, for all $t \in T$. Since $g$ is continuous $\mu$-a.e., there exists an open set, $A$, with
null complement such that $g$, restricted to $A$, is continuous. Furthermore, since $t \mapsto \varphi_t(x)$ is continuous
for $\mu$-a.e. $x$, there is a set, $B$, with null complement such that $t \mapsto \varphi_t(x)$ is continuous for all $x \in B$.
Thus, if $x \in \varphi_t^{-1}(A) \cap B$, then the composite map $t \mapsto \varphi_t(x) \mapsto g(\varphi_t(x))$ is continuous. Since $\varphi_t$ is
$\mu$-nonsingular,

$$ \mu[X \setminus (\varphi_t^{-1}(A) \cap B)] \le \mu[X \setminus \varphi_t^{-1}(A)] + \mu[X \setminus B] = \mu[\varphi_t^{-1}(X \setminus A)] = 0. $$

Thus, $t \mapsto g(\varphi_t(x))$ is continuous for $\mu$-a.e. $x \in X$, and we conclude the following:

Theorem 3 If $t \mapsto \varphi_t(x)$ is continuous for $\mu$-a.e. $x \in X$ and $\varphi_t$ is $\mu$-nonsingular for all $t \in T$, then
$t \mapsto \psi_t(y)$ is continuous for all $y \in Y^\circ$.

For many physical systems, the dynamical law $\varphi_t$ is not only invertible but also satisfies
$\varphi_t^{-1} = \varphi_{-t}$. In such cases, we shall say that $\varphi_t$ is time reversible. Time reversibility
often appears in physical systems which have the added property that $\varphi_t \circ R = R \circ \varphi_{-t}$ for some
involution $R = R^{-1}$, a property referred to as time reversal invariance. When such an $R$ exists,
every particle microstate $x$ has a mirror point $R(x)$ such that $\varphi_t(x) = R(\varphi_{-t}(R(x)))$; hence, the
trajectory of $x$ is mirrored by the trajectory of $R(x)$ with the direction of time reversed. We may
then partition $X$ into disjoint sets $X_0 = R(X_0)$, $X_1$, and $R(X_1)$. If the observable and a priori
measure are invariant under $R$, i.e. $g = g \circ R$ and $\mu \circ R^{-1} = \mu$, then a typical initial microstate
$(x_1, \ldots, x_n) \in \{G \in B_\delta\}$ will include roughly equal numbers of points from $X_1$ and $R(X_1)$. On
this basis, one expects the trajectories $\{G_t(x_1, \ldots, x_n) : t > 0\}$ and $\{G_t(x_1, \ldots, x_n) : t < 0\}$ to be
similar, though not identical, when $n$ is large. This suggests that the deterministic curve should be
perfectly symmetric in time. Indeed, it is easy to see that, if $\varphi_t$ is time reversal invariant under $R$
and both $g$ and $\mu$ are invariant under $R$, then



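$$\psi_{-t}(y) = \int_X (g \circ \varphi_t \circ R)\, \frac{e^{\lambda\, g \circ R}}{Z(\lambda)}\, d\mu = \int_X (g \circ \varphi_t)\, \frac{e^{\lambda g}}{Z(\lambda)}\, d\mu = \psi_t(y).$$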

Thus, time reversal invariance is sufficient for time symmetry of the deterministic curve, provided 
both the observable and the a priori measure are invariant under R. 

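Suppose $g$ is a simple function of the form $g = \sum_{i=1}^m a_i 1_{C_i}$, where each $a_i$ is distinct and
$C_1, \ldots, C_m$ form a partition of $X$. Now, the deterministic curve for this $g$ is given by

$$\psi_t(y) = \sum_{i=1}^m \sum_{j=1}^m a_i\, e^{\lambda a_j}\, \mu[\varphi_t^{-1}(C_i) \cap C_j] \Big/ \sum_{k=1}^m e^{\lambda a_k}\, \mu[C_k], \qquad (31)$$

using Eq. (25). Since we have supposed $g \circ R = g$, notice that $R^{-1}(C_i) = C_i$ for each $i$. Furthermore,
since $\mu$ is invariant under $R$,

$$\mu[\varphi_t^{-1}(C_i) \cap C_j] = \mu[R^{-1}(\varphi_t^{-1}(C_i)) \cap R^{-1}(C_j)] = \mu[\varphi_t(C_i) \cap C_j]. \qquad (32)$$

Furthermore, if $\mu$ is invariant under $\varphi_t$, then

$$\mu[\varphi_t^{-1}(C_i) \cap C_j] = \mu[C_i \cap \varphi_t(C_j)]. \qquad (33)$$

The system therefore exhibits strong detailed balance in the sense that

$$\mu[\varphi_t^{-1}(C_i) \cap C_j] = \mu[C_i \cap \varphi_t^{-1}(C_j)] \quad \text{for all } i, j, \text{ and } t. \qquad (34)$$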



Conversely, if we suppose only that $\varphi_t^{-1} = \varphi_{-t}$ exists and preserves $\mu$, then Eq. (34) implies
time symmetry of the deterministic curve when $g$ is simple. For example, suppose $g = a_1 1_{C_1} + a_2 1_{C_2}$
has only two possible states, with $C_1 = C$ and $C_2 = X \setminus C$. For any $\varphi_t$ which preserves $\mu$,

$$\begin{aligned}
\mu[\varphi_t^{-1}(X \setminus C) \cap C] &= \mu[C] - \mu[\varphi_t^{-1}(C) \cap C] \\
&= \mu[\varphi_t^{-1}(C)] - \mu[C \cap \varphi_t^{-1}(C)] \\
&= \mu[(X \setminus C) \cap \varphi_t^{-1}(C)].
\end{aligned}$$

Thus, $g$ exhibits strong detailed balance. The corresponding deterministic curve will be time symmetric
provided $\varphi_t^{-1} = \varphi_{-t}$ exists. In general, macroscopic averages of two-state single-particle
observables are always time symmetric, provided the dynamics are time reversible and preserve the
a priori measure. A system which exhibits strong detailed balance need not, however, be time
reversal invariant. (Consider $C = [0, 1]$ and $\varphi_t(x) = x + t$ on $\mathbb{R}$. The only possible $R$ is $R(x) = -x$,
yet clearly $R(C) \ne C$.) Thus, time reversal invariant systems form a proper subset of all systems
exhibiting time symmetry.

Finally, let us consider the asymptotic behavior of the deterministic curve for large $t$. Suppose
once more that $\mu$ is invariant under $\varphi_t$. The collection, $\{\varphi_t : t \in T\}$, of maps is said to be mixing
with respect to $\mu$ if, for any measurable subsets $A$ and $B$ of $X$,

$$\lim_{t \to \infty} \mu[\varphi_t^{-1}(A) \cap B] = \mu[A]\, \mu[B]. \qquad (35)$$

From this definition and the fact that $P_\lambda \ll \mu$, it follows that [21, p. 72]

$$\lim_{t \to \infty} \int_X g \circ \varphi_t \, dP_\lambda = \int_X g \, d\mu = y^*. \qquad (36)$$

If $\varphi_t$ is time reversible, then this result clearly holds in the limit $t \to -\infty$ as well. By its definition,
mixing is both a necessary and sufficient condition for $\psi_t(y)$ to converge to $E_g(\mu) = y^*$ for any $g$.
In cases where the dynamics are not mixing, however, one may still have convergence in time for a
restricted set of macroscopic functions.

As we have seen in our discussion of Poincaré recurrence, convergence in time for $\psi_t(y)$ need not
imply convergence in time for $G_t$. In fact, such behavior is often quite unlikely. Nevertheless, on a
short enough time scale, a scale which increases with $n$, and for large enough $n$, both $\psi_t(y)$ and $G_t$
will appear to converge to the same limit along the same trajectory.
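
The mixing condition of Eq. (35) can be probed numerically. The following is a minimal sketch, assuming the standard form of the baker map as a stand-in mixing system with Lebesgue measure on the unit square; the function names, the two rectangles, and the sample size are illustrative choices and do not come from the text.

```python
import numpy as np

def baker(pts):
    """One step of the standard baker map on [0,1)^2 (assumed form; rows are points)."""
    x, y = pts[:, 0], pts[:, 1]
    right = (x >= 0.5).astype(float)
    return np.column_stack((2.0 * x - right, 0.5 * (y + right)))

def in_rect(pts, rect):
    """Boolean mask of the points lying in the half-open rectangle rect."""
    (x0, x1), (y0, y1) = rect
    return (pts[:, 0] >= x0) & (pts[:, 0] < x1) & (pts[:, 1] >= y0) & (pts[:, 1] < y1)

def joint_measure(rect_a, rect_b, t, n_samples=200_000, seed=0):
    """Monte Carlo estimate of mu[phi_t^{-1}(A) & B] for Lebesgue mu and integer t >= 0."""
    rng = np.random.default_rng(seed)
    start = rng.random((n_samples, 2))
    end = start
    for _ in range(t):
        end = baker(end)
    # A point contributes if it starts in B and lands in A after t steps.
    return np.mean(in_rect(end, rect_a) & in_rect(start, rect_b))

A = ((0.2, 0.6), (0.0, 0.5))   # mu[A] = 0.2
B = ((0.0, 0.5), (0.3, 0.8))   # mu[B] = 0.25
for t in (0, 1, 2, 4, 8):
    print(t, joint_measure(A, B, t))   # compare with mu[A] * mu[B] = 0.05
```

For these rectangles the estimate starts near $\mu[A \cap B] = 0.06$ at $t = 0$ and should settle near the product $\mu[A]\,\mu[B] = 0.05$ after a few iterations, up to sampling error.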



6 Fractional Occupations 



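Consider the case $g = 1_C$, for which $G_t$ is the fraction of points in $C$ at time $t$. Since
$e^{\lambda g} = 1_{X \setminus C} + e^{\lambda} 1_C$, the corresponding partition function is

$$Z(\lambda) = \mu[X \setminus C] + e^{\lambda} \mu[C]. \qquad (37)$$

Recall that $\theta(\lambda) = \Psi'(\lambda) = e^{\lambda} \mu[C] / Z(\lambda)$ for $\lambda \in (-\infty, \infty)$; thus, for $y \in (0, 1)$,

$$\theta^{-1}(y) = \log\left( \frac{y}{\mu[C]} \cdot \frac{\mu[X \setminus C]}{1 - y} \right). \qquad (38)$$

Using Eqns. (25), (37), and (38), we find

$$\psi_t(y) = y\, \mu[\varphi_t^{-1}(C) \mid C] + (1 - y)\, \mu[\varphi_t^{-1}(C) \mid X \setminus C]. \qquad (39)$$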



This result is easily understood as follows: The expected number of points in C at time t will be 
the number of points starting in C times the fraction of those points expected to be in C at time t 
plus the number initially outside C times the fraction of those points expected to be in C at time t. 
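
The two-state formulas (37) through (39) translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and the two transition probabilities are passed in as plain numbers because their values depend on the dynamics.

```python
import math

def theta(lam, mu_c):
    """Initial mean y = theta(lambda) for g = 1_C, where mu_c = mu[C] (Eq. 37 and below)."""
    w = math.exp(lam) * mu_c
    return w / ((1.0 - mu_c) + w)

def theta_inv(y, mu_c):
    """Inverse map lambda = theta^{-1}(y) of Eq. (38), valid for 0 < y < 1."""
    return math.log((y / mu_c) * ((1.0 - mu_c) / (1.0 - y)))

def psi(y, p_stay, p_enter):
    """Eq. (39): p_stay = mu[phi_t^{-1}(C) | C], p_enter = mu[phi_t^{-1}(C) | X \\ C]."""
    return y * p_stay + (1.0 - y) * p_enter

lam = theta_inv(0.4, 0.2)          # positive, since y = 0.4 exceeds mu[C] = 0.2
print(lam, theta(lam, 0.2))        # theta recovers the initial mean y
print(psi(0.4, 0.0, 0.25))         # example with no points remaining in C
```

At $t = 0$ the transition probabilities are 1 and 0, so `psi(y, 1.0, 0.0)` returns `y`, as it should.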

The derivation of Eq. (39) was valid for $y \in (0, 1)$; using Eq. (Q) we can see that it is valid for
$y \in \{0, 1\}$ as well. Consider conditioning on $\{G = 1\}$ and note that this is equivalent to conditioning
on $C^n$, since $G(x_1, \ldots, x_n) = 1$ iff $x_i \in C$ for all $i$. The conditional distribution of $n G_t$ is therefore
binomial with parameters $n$ and $\mu[\varphi_t^{-1}(C) \mid C]$. By the strong law of large numbers, $G_t$, conditioned
on $\{G = 1\}$, converges almost surely, hence in probability, to $\mu[\varphi_t^{-1}(C) \mid C]$. A similar argument
shows that $G_t$, conditioned on $\{G = 0\}$, converges to $\mu[\varphi_t^{-1}(C) \mid X \setminus C]$. We note in passing that
the distribution of $\sqrt{n}\, G_t$, once centered at its mean, is asymptotically Gaussian by the central limit
theorem, provided these transition probabilities are neither 0 nor 1. Thus, deviations from the
deterministic curve go to zero as $n^{-1/2}$ when $y \in \{0, 1\}$.
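
The $n^{-1/2}$ scaling can be illustrated by sampling the binomial law directly. This is a minimal sketch: the transition probability `p = 0.25` and the number of replicates are arbitrary illustrative values, not quantities taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.25       # hypothetical value of mu[phi_t^{-1}(C) | C]
reps = 4000    # number of independent ensembles per value of n

def scaled_spread(n):
    """Sample standard deviation of G_t given {G = 1}, multiplied by sqrt(n)."""
    g_t = rng.binomial(n, p, size=reps) / n
    return np.sqrt(n) * g_t.std()

for n in (100, 1_000, 10_000, 100_000):
    # The scaled spread should hover near sqrt(p * (1 - p)) for every n.
    print(n, scaled_spread(n), np.sqrt(p * (1.0 - p)))
```

A roughly constant scaled spread across several decades of $n$ is what the central limit theorem predicts for the fluctuations of $G_t$ about the deterministic value.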

In Fig. 1 we have plotted $\theta$ versus $\lambda$ for three different values of $\mu[C] = \theta(0)$. The strict
monotonicity of the curve implies $\theta$ is invertible, in accordance with Lemma 1. If, for example,
$g = 1_C$ is the energy of a two state particle, with energies 0 and 1, then $\lambda = -(k_B T)^{-1}$, where
$T$ is the absolute temperature and $k_B$ is Boltzmann's constant. An initial macrostate with energy
density, $y$, near zero corresponds to a heavily populated low energy state and hence a low, positive
temperature ($\lambda \ll 0$). If $y = \mu[C]$, then the initial macrostate is at its a priori most likely state; if $\mu$
is invariant under $\varphi_t$, this corresponds to macroscopic equilibrium. Note that $\lambda \to 0^-$ corresponds to
$T \to +\infty$; at high temperatures the particles are uniformly distributed, with respect to $\mu$, between
the two energy states. For macrostates beginning above equilibrium, i.e. $y > \mu[C]$, there is an
effective population inversion similar to that found in systems of weakly coupled magnetic dipoles.
Systems with small negative temperatures ($\lambda \gg 0$) tend to have more densely populated high energy
states, while large negative temperatures ($\lambda \to 0^+$) again correspond to near equilibrium conditions.
The single parameter $\mu[C]$ determines the asymmetry between states below equilibrium and those
above. Thus, we see that $\lambda$ plays a role in defining initial macrostates which is analogous to that of
temperature in defining equilibrium states.

Notice that the function $\theta$ is associated only with the initial macrostate. The time-evolved
behavior of fractional occupations is contained in the two transition probabilities in Eq. (39). In
general, these may be difficult to determine. For the baker map, $\phi$, a discrete time map on the unit
square ||2^, these may be computed for rectangular cells with Lebesgue measure. This allows
one to calculate $\psi_t(y)$ for several time iterations and to compare this with Monte Carlo simulations.
Although an abstract map, the baker map shares many of the relevant features of more realistic
Hamiltonian dynamical systems. In particular, it is Lebesgue measure preserving, mixing, and time
reversal invariant in the sense that $\phi \circ R = R \circ \phi^{-1}$ for $R(x, y) = (y, x)$.
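
The time reversal invariance of the baker map can be checked pointwise. The sketch below assumes the standard form of the map (stretch by two horizontally, compress by two vertically, then stack the right half on top), which the text does not spell out; the function names are illustrative.

```python
import numpy as np

def baker(pts):
    """Forward baker map on [0,1)^2; pts is an (n, 2) array of points."""
    x, y = pts[:, 0], pts[:, 1]
    right = (x >= 0.5).astype(float)
    return np.column_stack((2.0 * x - right, 0.5 * (y + right)))

def baker_inv(pts):
    """Inverse baker map: undo the stacking, then compress horizontally."""
    x, y = pts[:, 0], pts[:, 1]
    top = (y >= 0.5).astype(float)
    return np.column_stack((0.5 * (x + top), 2.0 * y - top))

def reflect(pts):
    """The involution R(x, y) = (y, x)."""
    return pts[:, ::-1]

rng = np.random.default_rng(0)
pts = rng.random((1000, 2))
print(np.allclose(baker(reflect(pts)), reflect(baker_inv(pts))))  # phi o R = R o phi^{-1}
print(np.allclose(baker_inv(baker(pts)), pts))                     # invertibility
```

Because the forward and inverse maps are mirror images of each other under the swap of coordinates, both checks hold up to floating point rounding.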

In Fig. 2 we have plotted $\psi_t(y)$ for $t = -10, \ldots, 10$ and $y = 0.4$ using the baker map. (Since
the map is invertible, negative times refer to iterations of the inverse map.) The cell was chosen
arbitrarily to be $C = [0.2, 0.6) \times [0.0, 0.5)$, for which $\mu[C] = 0.2$. The values of $\psi_t(y)$ are connected
by straight solid lines in the figure. For comparison, a single realization of an ensemble of $n = 50{,}000$
points was generated which satisfied the initial macrostate $y \approx 0.4$. This was done by drawing the
first $\lfloor ny \rfloor$ points uniformly from $C$ and then drawing the rest from outside $C$. (Here, "uniformly"
means with respect to Lebesgue measure.) Once generated, the known form of the map $\varphi_t := \phi^t$
was used to time evolve the initial ensemble for each value of $t$. The fractional occupation, $G_t$, was
then computed for each time-evolution of the initial ensemble and is indicated by a solid dot in the
figure.
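
The sampling procedure just described can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the figure: it assumes the standard form of the baker map, draws the points outside $C$ by rejection sampling, and uses hypothetical function names and an arbitrary random seed.

```python
import numpy as np

def baker(pts):
    """Forward baker map on [0,1)^2 (assumed standard form)."""
    x, y = pts[:, 0], pts[:, 1]
    right = (x >= 0.5).astype(float)
    return np.column_stack((2.0 * x - right, 0.5 * (y + right)))

def baker_inv(pts):
    """Inverse baker map, used for negative times."""
    x, y = pts[:, 0], pts[:, 1]
    top = (y >= 0.5).astype(float)
    return np.column_stack((0.5 * (x + top), 2.0 * y - top))

def in_cell(pts, cell):
    """Boolean mask of points inside the half-open rectangular cell."""
    (x0, x1), (y0, y1) = cell
    return (pts[:, 0] >= x0) & (pts[:, 0] < x1) & (pts[:, 1] >= y0) & (pts[:, 1] < y1)

def sample_initial(n, y, cell, rng):
    """Draw floor(n*y) points uniformly from the cell and the rest uniformly from outside it."""
    (x0, x1), (y0, y1) = cell
    k = int(np.floor(n * y))
    inside = np.column_stack((rng.uniform(x0, x1, k), rng.uniform(y0, y1, k)))
    outside = np.empty((0, 2))
    while len(outside) < n - k:
        cand = rng.random((2 * (n - k), 2))
        outside = np.vstack((outside, cand[~in_cell(cand, cell)]))  # rejection step
    return np.vstack((inside, outside[: n - k]))

def fractional_occupation(pts, t, cell):
    """G_t: fraction of the ensemble found in the cell after |t| forward or inverse steps."""
    step = baker if t >= 0 else baker_inv
    for _ in range(abs(t)):
        pts = step(pts)
    return in_cell(pts, cell).mean()

rng = np.random.default_rng(1)
cell = ((0.2, 0.6), (0.0, 0.5))
ensemble = sample_initial(50_000, 0.4, cell, rng)
for t in range(-10, 11):
    print(t, fractional_occupation(ensemble, t, cell))
```

Only a modest number of iterations is attempted here, since each application of the doubling step discards one bit of floating point precision in one of the coordinates.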

The qualitative behavior of $\psi_t(y)$ in Fig. 2 is particularly notable in two regards. First, it is
readily observed that the plot is symmetric about $t = 0$; in particular, $\psi_{-t}(y) = \psi_t(y)$ exactly,
while $G_{-t}$ and $G_t$ are only approximately equal. This, as was shown in Section 5.2, is a general
property of two-state systems for which $\varphi_t$ is $\mu$-measure preserving and time reversible. Hence, there
is no distinction between the forward and reverse time directions. The second observation is that
$\psi_t(y) \to \mu[C]$ as $t \to \pm\infty$, which is a direct consequence of the mixing property. Thus, the baker
map provides a simple model of an equilibrating macroscopic quantity.

A further comment is that, while at each given time, $t$, the most probable macrostate is $\psi_t(y)$,
for any finite $n$ the set $\{\psi_t(y) : t \in \mathbb{Z}\}$ is itself an improbable realization of $\{G_t : t \in \mathbb{Z}\}$. This may
be understood by observing that, given $\epsilon > 0$, we have $|\psi_t(y) - \mu[C]| < \epsilon$ for all $|t|$ sufficiently large,
yet, by the Poincaré recurrence theorem, $|G_t - \mu[C]| > \epsilon$ for infinitely many values of $t$, almost surely.

The family of macroscopic maps, $\{\psi_t : t \in \mathbb{Z}\}$, does not form a group, or even a semigroup, in
contrast to the family of microscopic maps, $\{\phi^t : t \in \mathbb{Z}\}$. Thus, while $x = \varphi_{-t}(\varphi_t(x))$, in general
$y \ne \psi_{-t}(\psi_t(y)) = \psi_t(\psi_t(y))$, since $\psi_t$ is time symmetric. Furthermore, $\{\psi_t : t \ge 0\}$ does not even
form a semigroup, since this would imply $|\psi_t(y) - y^*| \ge |\psi_{t+s}(y) - y^*|$ for all $s, t \ge 0$, i.e. that
all future macrostates are closer to equilibrium than their predecessors. To understand this note
that, while $\psi_{t+s}(y)$ describes the state of an observable at time $t + s$ whose value was $y$ at time
zero, $\psi_s(\psi_t(y))$ describes the state of an observable at time $t + s$ whose value was $\psi_t(y)$ at time $t$.
The latter corresponds to a rerandomization of the original distribution, which removes correlations
that would otherwise be preserved by the dynamics and causes disagreement with the actual time
evolution of the observable.

The baker map is a discrete time map, whereas the dynamics of physical systems are given by
continuous time flows. A simple example of this type of system is the rotation map on the unit
square, which is given by

$$\varphi_t(x_1, x_2) = (x_1 + \omega_1 t,\; x_2 + \omega_2 t) \bmod 1. \qquad (40)$$

A plot of $\psi_t(y)$ for $\omega_1 = \sqrt{2}$, $\omega_2 = \sqrt{3}$, and $y = 0.4$ is given in Fig. 3. We again note the perfect
symmetry about $t = 0$ found previously in the baker map. Unlike the baker map, however, the
expectation value does not converge to an asymptotic value but varies quasi periodically in time.
(The flat portions of the graph occur when $\varphi_t^{-1}(C) \cap C$ is empty and hence only $0.6 \times (0.2/0.8) = 0.15$
of the remaining points are expected to be in $C$.) A particular realization using $n = 5{,}000$ is plotted
for comparison.
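
For a rectangular cell the rotation map admits a closed form for $\psi_t(y)$, since $\varphi_t^{-1}(C)$ is a rigid translate of $C$ on the torus. The sketch below rests on that observation and on Eq. (39); the overlap formula and all function names are assumptions introduced for illustration and do not appear in the text.

```python
import math

def circular_overlap(length, shift):
    """Length of the overlap between an arc of the unit circle and its translate by shift."""
    d = shift % 1.0
    return max(0.0, length - d) + max(0.0, length - (1.0 - d))

def psi_rotation(t, y, cell=((0.2, 0.6), (0.0, 0.5)),
                 omega=(math.sqrt(2.0), math.sqrt(3.0))):
    """Deterministic curve of Eq. (39) for the rotation map of Eq. (40), Lebesgue mu."""
    (x0, x1), (y0, y1) = cell
    mu_c = (x1 - x0) * (y1 - y0)
    # mu[phi_t^{-1}(C) & C] factorizes into two one-dimensional overlaps.
    joint = (circular_overlap(x1 - x0, omega[0] * t)
             * circular_overlap(y1 - y0, omega[1] * t))
    p_stay = joint / mu_c                      # mu[phi_t^{-1}(C) | C]
    p_enter = (mu_c - joint) / (1.0 - mu_c)    # uses mu[phi_t^{-1}(C)] = mu[C]
    return y * p_stay + (1.0 - y) * p_enter

for t in (0.0, 0.1, 0.35, 0.7, -0.7):
    print(t, psi_rotation(t, 0.4))

# Composition versus direct evolution over the same total time.
print(psi_rotation(0.35, psi_rotation(0.35, 0.4)), psi_rotation(0.7, 0.4))
```

At $t = 0.35$ the translate of $C$ no longer meets $C$, so the curve sits on the flat value 0.15. The last line compares $\psi_s(\psi_t(y))$ with $\psi_{t+s}(y)$ for $s = t = 0.35$; the two differ, in line with the failure of the semigroup property noted above.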

Near $t = 0$, the deterministic curve is linear with a slope pointing toward the equilibrium value,
$\mu[C] = 0.2$, as $|t|$ increases, a behavior which holds generally for any value of $y$. Thus, the initial
tendency of the system is to move monotonically toward the equilibrium value. Such behavior has 
been ascribed to Boltzmann's H-function js) , though extrapolated to include times far from zero as 
well. As we have seen from the maps considered here, this extrapolation need not be valid. Since the 
H-function is computed from fractional occupations, though, it seems reasonable that monotonicity 
toward equilibrium should hold at least for t near zero. 

7 Discussion

We have considered a general class of systems composed of identical constituents, here called "parti- 
cles," that are dynamically noninteracting. For the microstates of the collective system, we supposed 
there is an a priori measure, typically an invariant measure, that describes the distribution of these 
microstates in the absence of any restrictions based on the given macrostate. We further supposed 
that the particles are statistically independent, that any correlations among them arise only by the
need to satisfy the given macroscopic constraint. No attempt was made to justify these assumptions
at a more fundamental level, though we believe they are quite reasonable for many physical systems. 

What we have shown is that the time evolution of a particular macroscopic variable, namely the
average over certain real-valued single-particle functions, is such that it converges in a probabilistic
sense to a well defined curve as the number of particles tends to infinity. Specifically, we have
derived a map $\psi_t$ such that, if the macrostate at time 0 is constrained to be near a value $y$, then the
macrostate at time $t$ will be in a given neighborhood of $\psi_t(y)$ with a probability approaching one as
the number of particles tends to infinity. The map $\psi_t$ was defined in terms of an expectation with
respect to a canonical distribution in which $y$ plays the role of an average energy in the familiar
thermodynamic formalism. The restrictions on the single-particle function were that it be bounded
and continuous almost everywhere in a sense specified by the a priori measure. We found that the
family of macroscopic maps, $\{\psi_t : t \in T\}$, in general forms neither a group nor a semigroup, even if
the family of microscopic maps, $\{\varphi_t : t \in T\}$, has this property.

Having established this basic convergence result for a given time, we then considered how well
the deterministic curve, the graph of $\psi_t(y)$ versus $t$, represented the behavior of a typical realization
of the macrostates over all time. We found that the two may differ qualitatively quite substantially;
while there may be good agreement on a finite set of selected times, there will typically be times at
which they differ substantially. This was particularly true of mixing systems, for which $\psi_t(y)$ always
converges in the long time limit, while a typical trajectory exhibits recurrences. Under some more
restrictive conditions we proved convergence on any countably infinite set of times, but even then
recurrences are possible when $n$ is finite.

With these caveats on the correspondence between the finite and infinite particle cases, we considered
some general properties of the expectation curve as a function of time. We found that, despite
the fact that the macrostates may evolve discontinuously, the deterministic curve may be continu- 
ous in time. We also found that, for systems which are time reversal invariant, the deterministic 
curve is symmetric in time about t = 0, the point at which conditioning of initial macrostates takes 
place. These properties were then related to familiar geometric properties attributed to Boltzmann's 
H-curve. 

We have not considered extensions of these results to macroscopic variables in, say, $\mathbb{R}^m$, which
would involve issues of convexity that make the extension nontrivial. The general problem of interacting
particles poses a greater difficulty and requires a significant change of methodology, though
we conjecture that similar results will hold if $\psi_t(y)$ is defined as a limit of $n$-particle expectations.

A Proof of Theorem 1



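PROOF. It suffices to consider $x_B \notin \bar{A}$ since, if $x_B \in A^\circ$, then $x_B \notin X \setminus A^\circ = \overline{X \setminus A}$.

Since $x^* \notin \partial B$, either $x^* \in B^\circ$ or $x^* \notin \bar{B}$. Suppose the former. By Eq. (19), $P_n[B] \to 1$ as
$n \to \infty$, which implies $\lim_{n \to \infty} P_n[A \mid B] = \lim_{n \to \infty} P_n[A]$. Now, $P_n[A] \to 0$ if $x^* \notin \bar{A}$, while $P_n[A] \to 1$ if
$x^* \in A^\circ$. Since $x_B = x^*$ for this case, the result is proven.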

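Suppose instead that $x^* \notin \bar{B}$. Then $0 < \inf I(\bar{B}) \le \inf I(B^\circ)$. By Eq. (18) we have, for any $\epsilon > 0$,

$$P_n[B] \ge \exp[-a_n (1 + \epsilon) \inf I(B^\circ)] > 0$$

for all $n$ sufficiently large; thus, $P_n[A \mid B]$ is well defined. Suppose further that $\inf I(\overline{A \cap B}) < \infty$.
From Eq. (18) we may also deduce that for all $n$ sufficiently large,

$$P_n[A \mid B] \le \frac{\exp[-a_n (1 - \epsilon) \inf I(\overline{A \cap B})]}{\exp[-a_n (1 + \epsilon) \inf I(B^\circ)]}
= \exp[-a_n (1 - \epsilon) \inf I(\overline{A \cap B}) + a_n (1 + \epsilon) \inf I(B^\circ)].$$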



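Now suppose $x_B \notin \bar{A}$. To show that $P_n[A \mid B] \to 0$, it will suffice to show that we may choose $\epsilon$
such that

$$(1 - \epsilon) \inf I(\overline{A \cap B}) - (1 + \epsilon) \inf I(B^\circ) > 0.$$

Since this means we must choose $\epsilon$ small enough so that

$$\epsilon < \frac{\inf I(\overline{A \cap B}) - \inf I(B^\circ)}{\inf I(\overline{A \cap B}) + \inf I(B^\circ)},$$

we see that it will suffice to show that $\inf I(\overline{A \cap B}) > \inf I(B^\circ)$. (Note that, since $\inf I(B^\circ) > 0$, the
denominator in the above inequality is indeed nonzero.)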

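We have assumed $B$ is an $I$-continuity set, so $\inf I(B^\circ) = \inf I(\bar{B}) = I(x_B) < \infty$. Now,
if $\overline{A \cap B} = \emptyset$, then $\inf I(\overline{A \cap B}) = \infty > \inf I(\bar{B}) = \inf I(B^\circ)$ and we are done. Suppose that
$\overline{A \cap B} \ne \emptyset$. Then there exists an $\tilde{x} \in \overline{A \cap B}$ such that $I(\tilde{x}) = \inf I(\overline{A \cap B})$, since $I$ is a good
rate function. Notice that $x_B \notin \overline{A \cap B}$, since $\overline{A \cap B} \subset \bar{A} \cap \bar{B} \subset \bar{A}$ and $x_B \notin \bar{A}$. Clearly, then,
$x_B \ne \tilde{x}$. Since $\overline{A \cap B} \subset \bar{B}$ as well, $\inf I(\overline{A \cap B}) \ge \inf I(\bar{B})$, or, equivalently, $I(\tilde{x}) \ge I(x_B)$. Equality
cannot hold, however, since, if that were the case, then $I(\tilde{x})$ would equal $\inf I(\bar{B})$, in violation of
the assumed uniqueness of $x_B$. Therefore, $\inf I(\overline{A \cap B}) > \inf I(\bar{B}) = \inf I(B^\circ)$.

Now suppose $\inf I(\overline{A \cap B}) = \infty$ instead. By Eq. (18), this implies $P_n[A \cap B] \to 0$. Since $P_n[B] > 0$
for all $n$ sufficiently large, this implies $P_n[A \mid B] \to 0$. □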

B Proof of Gibbs Conditioning Lemmas 

The proofs of Lemmas 1 and 2 are adapted from those of Ellis |^, who considers the case in which $g$ is
a simple function. The extension to a general bounded measurable function is similar but not trivial.
Dembo and Zeitouni p. 294-7] prove Lemma 3 for the $\tau$-topology, for which $E_g$ is continuous for
any bounded $g$. The proof given here applies for the weaker Prohorov metric topology.

B.1 Proof of Lemma 1

PROOF. We first note that, since $\lambda \mapsto e^{\lambda g(x)}$ is bounded and differentiable to all orders for $\mu$-a.e.
$x \in X$, by Lebesgue dominated convergence $\Psi'$ and $\Psi''$ are well-defined and continuous. Specifically,
$\Psi'(\lambda) = \int_X g \, dP_\lambda$ and $\Psi''(\lambda) = \int_X g^2 \, dP_\lambda - (\int_X g \, dP_\lambda)^2 \ge 0$ for $\lambda \in \mathbb{R}$. By assumption $\Psi''(\lambda)$ is
nonzero, so in fact $\Psi''(\lambda) > 0$ and hence $\Psi'$ increases monotonically.

We now show that $\Psi'(\lambda) \uparrow y_{\max}$ as $\lambda \to \infty$. For $\lambda > 0$, note that

$$|\Psi'(\lambda) - y_{\max}| = \frac{\int_Y (y_{\max} - y)\, e^{\lambda y}\, d\nu(y)}{\int_Y e^{\lambda y}\, d\nu(y)},$$

where $\nu := \mu \circ g^{-1}$. Now, for any $\delta > 0$,

$$|\Psi'(\lambda) - y_{\max}| \le |y_{\min} - y_{\max}|\, \nu[(y_{\min}, y_{\max})]\, e^{-\lambda \delta} + \delta\, e^{\lambda \delta}.$$

For $\lambda > 1$, take $\delta = \lambda^{-1} \log(\log(\lambda))$ and note that $e^{-\lambda \delta} = 1/\log(\lambda)$ and $\delta e^{\lambda \delta} = \lambda^{-1} \log(\lambda) \log(\log(\lambda))$.
The latter two terms vanish as $\lambda \to \infty$, as may be readily verified by L'Hospital's rule.

A similar argument shows that $\Psi'(\lambda) \downarrow y_{\min}$ as $\lambda \to -\infty$. Continuity then implies that $\Psi'$ is
surjective onto $Y^\circ$. Thus, $\theta$ is invertible with $\theta^{-1}(y) := (\Psi')^{-1}(y)$ for $y \in Y^\circ$, $\theta^{-1}(y_{\min}) := -\infty$, and
$\theta^{-1}(y_{\max}) := +\infty$. Since $\theta$ increases monotonically and $\theta(0) = y^*$, $\lambda > 0$ implies $y = \theta(\lambda) > y^*$,
while $\lambda < 0$ implies $y = \theta(\lambda) < y^*$. Since $\theta$ is invertible, we have conversely that $y > y^*$ implies
$\lambda > 0$, while $y < y^*$ implies $\lambda < 0$. Clearly, $\lambda = 0$ if and only if $y = y^*$. □



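B.2 Proof of Lemma 2

PROOF. Since $P_\lambda \ll \mu$, we have

$$I_\mu(P_\lambda) = \int_X \log \frac{dP_\lambda}{d\mu} \, dP_\lambda = \lambda \int_X g \, dP_\lambda - \Psi(\lambda) = \lambda y - \Psi(\lambda).$$

Since $g$ is bounded, $\Psi(\lambda) = \log \int_X e^{\lambda g} \, d\mu > -\infty$ and hence $I_\mu(P_\lambda) < \infty$.

Now let $P \in \bar{A}_\delta \setminus \{P_\lambda\}$. If $P \not\ll \mu$, then $I_\mu(P) = \infty$ and $I_\mu(P_\lambda) < I_\mu(P)$. Consider then $P \ll \mu$.
Using the chain rule, $dP/d\mu = (dP/dP_\lambda)(dP_\lambda/d\mu)$ ||, pp. 265-6], we find

$$\begin{aligned}
I_\mu(P) &= \int_X \log \frac{dP}{d\mu} \, dP = \int_X \log \frac{dP}{dP_\lambda} \, dP + \int_X \log \frac{dP_\lambda}{d\mu} \, dP \\
&= I_{P_\lambda}(P) + \lambda \int_X g \, dP - \Psi(\lambda) \\
&> \lambda \int_X g \, dP - \Psi(\lambda),
\end{aligned}$$

where we have used the fact that $I_{P_\lambda}(P) > 0$ since $P \ne P_\lambda$.

Since $P \ll \mu$, $E_g$ is continuous at $P$; thus, $P \in \bar{A}_\delta$ implies $E_g(P) = \int_X g \, dP \in B_\delta$. If $y < y^*$, then
$B_\delta = [y - \delta, y]$ and $\lambda < 0$ by Lemma 1. Now, $\lambda < 0$ and $\int_X g \, dP \le y$ imply $\lambda \int_X g \, dP \ge \lambda y$. If, on
the other hand, $y > y^*$, then $B_\delta = [y, y + \delta]$ and $\lambda > 0$, while $\int_X g \, dP \ge y$, so again $\lambda \int_X g \, dP \ge \lambda y$.
Finally, if $y = y^*$ then $\lambda = 0$ and $\lambda \int_X g \, dP \ge \lambda y$ holds trivially. Thus, for all $P \in \bar{A}_\delta \setminus \{P_\lambda\}$ with $P \ll \mu$,

$$I_\mu(P_\lambda) = \lambda y - \Psi(\lambda) \le \lambda \int_X g \, dP - \Psi(\lambda) < I_\mu(P).$$

Since $P_\lambda \in A_\delta \subset \bar{A}_\delta$, this completes the proof. □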
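
For the two-state observable $g = 1_C$ the identity $I_\mu(P_\lambda) = \lambda y - \Psi(\lambda)$ and the minimizing property can be checked numerically. The sketch below is a simplification: it restricts attention to laws that are uniform (with respect to $\mu$) within $C$ and within $X \setminus C$, so that the relative entropy reduces to a two-point formula; the helper name and the values of $\mu[C]$, $y$, and $\delta$ are illustrative.

```python
import math

def kl_two_state(q, mu_c):
    """Relative entropy of the two-point law (q, 1 - q) with respect to (mu_c, 1 - mu_c)."""
    return q * math.log(q / mu_c) + (1.0 - q) * math.log((1.0 - q) / (1.0 - mu_c))

mu_c, y, delta = 0.2, 0.4, 0.1
lam = math.log((y / mu_c) * ((1.0 - mu_c) / (1.0 - y)))   # theta^{-1}(y), Eq. (38)
log_z = math.log((1.0 - mu_c) + math.exp(lam) * mu_c)     # Psi(lambda) = log Z(lambda)

entropy_at_y = kl_two_state(y, mu_c)
print(entropy_at_y, lam * y - log_z)       # the two expressions agree

# Scan means in B_delta = [y, y + delta] (the case y > y*); the minimum sits at q = y.
grid = [y + delta * k / 100 for k in range(101)]
minimizer = min(grid, key=lambda q: kl_two_state(q, mu_c))
print(minimizer)
```

The scan covers only the endpoint structure of $B_\delta$ for $y > y^*$; the case $y < y^*$ is symmetric, with the minimum at the right endpoint of $[y - \delta, y]$.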
B.3 Proof of Lemma 3



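PROOF. Suppose $y < y^*$, and let $\lambda_n = \lambda - 1/n$, where $\lambda = \theta^{-1}(y)$. (A similar argument may be
applied if $y > y^*$.) Since $\theta$ is strictly monotonic, $E_g(P_{\lambda_n}) = \theta(\lambda_n) < \theta(\lambda) = y$. We will first show
that $E_g(P_{\lambda_n}) \in B_\delta^\circ$ for all $n$ sufficiently large and that $E_g(P_{\lambda_n}) \to E_g(P_\lambda)$. Since $E_g$ is continuous
at $P_\lambda$, it will suffice to show that $P_{\lambda_n} \to P_\lambda$ in $\rho$.

Now, convergence in $\rho$ is equivalent to the weak convergence of $P_{\lambda_n}$ to $P_\lambda$ [^7, p. 310]. Thus, let $h$ be
an arbitrary bounded continuous function on $X$ and define $F$ such that $F(\lambda') = \int_X h\, e^{\lambda' g} / Z(\lambda') \, d\mu$
for $\lambda' \in \mathbb{R}$. Since the integrand is bounded and continuous for $\mu$-a.e. $x$, it follows that $F$ is continuous
everywhere and $F(\lambda_n) \to F(\lambda)$. This proves weak convergence and hence convergence in $\rho$.

We have shown that $E_g(P_{\lambda_n}) \in B_\delta^\circ$ for all $n$ sufficiently large. From this it follows that $P_{\lambda_n} \in
A_\delta^\circ \subset A_\delta$ and hence $I_\mu(P_{\lambda_n}) = \lambda_n E_g(P_{\lambda_n}) - \Psi(\lambda_n) \ge I_\mu(P_\lambda)$ for all $n$ sufficiently large. Now,
$\inf I_\mu(A_\delta^\circ) \ge \inf I_\mu(\bar{A}_\delta)$, so $I_\mu(P_{\lambda_n}) \ge \inf I_\mu(A_\delta^\circ) \ge \inf I_\mu(\bar{A}_\delta) = I_\mu(P_\lambda)$. Since $E_g(P_{\lambda_n}) \to E_g(P_\lambda)$
and $\Psi(\lambda_n) \to \Psi(\lambda)$ as $\lambda_n \to \lambda$, it is clear that $I_\mu(P_{\lambda_n}) \to I_\mu(P_\lambda)$. Hence $\inf I_\mu(A_\delta^\circ) = \inf I_\mu(\bar{A}_\delta)$. □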

Figure 1: Three plots of $y = \theta(\lambda)$, where $g = 1_C$. The value at $\lambda = 0$ is given by $\mu[C]$.

Figure 2: Plot of the fractional occupation of $C$ versus time for the baker map with $C = [0.2, 0.6) \times
[0.0, 0.5)$ and $y = 0.4$. Straight lines are drawn between the values of $\psi_t(y)$ for each integer value
of the iteration time $t$. The solid dots are the values of $G_t$ for a single realization of an ensemble of
$n = 50{,}000$ points. The error bars are 95% confidence intervals.

Figure 3: Plot of the expected fractional occupation of $C$ versus time for the ergodic rotation map
with $C = [0.2, 0.6) \times [0.0, 0.5)$ and $y = 0.4$. A particular realization using $n = 5{,}000$ is plotted for
comparison.